Overview

Dataset Statistics

Number of Variables 18
Number of Rows 420768
Missing Cells 74027
Missing Cells (%) 1.0%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 101.3 MB
Average Row Size in Memory 252.5 B
Variable Types
  • Numerical: 15
  • Categorical: 3
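
The percentage and per-row figures above can be re-derived from the raw counts. The following is a minimal sketch, assuming that "MB" in the report means 2**20 bytes (the report does not say which convention it uses):

```python
# Figures copied from the Dataset Statistics block.
n_rows = 420_768
n_vars = 18
missing_cells = 74_027
total_mem_mb = 101.3  # as reported, already rounded to one decimal

# Share of missing cells over the whole rows x variables grid.
total_cells = n_rows * n_vars
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MB = 2**20 bytes. With 10**6 bytes the result is noticeably lower.
avg_row_bytes = total_mem_mb * 2**20 / n_rows

print(f"missing cells: {missing_pct:.2f}%")
print(f"average row size: {avg_row_bytes:.1f} B")
```

Because the total size is itself rounded, the per-row value may differ from the reported 252.5 B in the last digit.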

Dataset Insights

  • No is uniformly distributed
  • PM2.5 has 8739 (2.08%) missing values
  • PM10 has 6449 (1.53%) missing values
  • SO2 has 9021 (2.14%) missing values
  • NO2 has 12116 (2.88%) missing values
  • CO has 20701 (4.92%) missing values
  • O3 has 13277 (3.16%) missing values
  • PM2.5 is skewed
  • PM10 is skewed
  • SO2 is skewed
  • CO is skewed
  • O3 is skewed
  • RAIN is skewed
  • WSPM is skewed
  • year has constant length 4
  • TEMP has 64249 (15.27%) negatives
  • DEWP has 185393 (44.06%) negatives
  • RAIN has 403858 (95.98%) zeros

Variables


No

numerical

Approximate Distinct Count 35064
Approximate Unique (%) 8.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 6.4 MB
Mean 17532.5
Minimum 1
Maximum 35064
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • No is uniformly distributed

Quantile Statistics

Minimum 1
5-th Percentile 1754.15
Q1 8766.75
Median 17532.5
Q3 26298.25
95-th Percentile 33310.85
Maximum 35064
Range 35063
IQR 17531.5

Descriptive Statistics

Mean 17532.5
Standard Deviation 10122.1169
Variance 1.0246e+08
Sum 7.3771e+09
Skewness 0
Kurtosis -1.2
Coefficient of Variation 0.5773
  • No is not normally distributed (p-value 0.0009073756540153908)
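
The moments reported for No match what a discrete uniform sequence 1..35064 would give. The sketch below recomputes them under two assumptions that the report does not state: each value of No occurs once per station (12 copies, which is consistent with 35064 × 12 = 420768 rows), and the standard deviation uses the sample (n − 1) denominator.

```python
import math

n = 35_064      # distinct values of No: 1..n
copies = 12     # assumption: one copy of each value per station
N = n * copies  # 420768 rows in total

values = range(1, n + 1)
mean = math.fsum(values) / n

# Central moments of the distinct values; repeating every value the same
# number of times leaves these population moments unchanged.
m2 = math.fsum((v - mean) ** 2 for v in values) / n
m3 = math.fsum((v - mean) ** 3 for v in values) / n
m4 = math.fsum((v - mean) ** 4 for v in values) / n

skewness = m3 / m2 ** 1.5
excess_kurtosis = m4 / m2 ** 2 - 3

# Assumption: sample standard deviation (n - 1 denominator) over all N rows.
sample_std = math.sqrt(m2 * N / (N - 1))

print(mean, skewness, excess_kurtosis, sample_std)
```

An excess kurtosis close to -1.2 with zero skewness is the signature of a uniform distribution, which agrees with the "uniformly distributed" insight.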

year

categorical

Approximate Distinct Count 5
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 27.7 MB

Length

Mean 4
Standard Deviation 0
Median 4
Minimum 4
Maximum 4

Sample

1st row 2013
2nd row 2013
3rd row 2013
4th row 2013
5th row 2013

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 1683072
  • The top 2 categories (2016, 2014) take over 50.0%
  • year has words of constant length
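
The "top 2 categories" insight can be checked with a short sketch. It assumes that No indexes consecutive hourly records starting at 2013-03-01 00:00; the start date is a hypothetical value chosen for illustration, since the report only shows that the first rows fall in 2013.

```python
from collections import Counter
from datetime import datetime, timedelta

start = datetime(2013, 3, 1, 0)  # assumption: not stated in the report
n_hours = 35_064                 # maximum of No

# Count how many hourly records fall in each calendar year.
per_year = Counter(
    (start + timedelta(hours=h)).year for h in range(n_hours)
)

top2 = per_year.most_common(2)
top2_share = sum(count for _, count in top2) / n_hours

print(dict(per_year))
print(top2, f"{100 * top2_share:.2f}%")
```

Under this assumption the records span five calendar years, matching the distinct count of 5, and the two largest years together hold slightly more than half of the rows.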

month

numerical

Approximate Distinct Count 12
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 6.4 MB
Mean 6.5229
Minimum 1
Maximum 12
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • month is skewed left (γ1 = -0.0093)

Quantile Statistics

Minimum 1
5-th Percentile 1
Q1 4
Median 7
Q3 10
95-th Percentile 12
Maximum 12
Range 11
IQR 6

Descriptive Statistics

Mean 6.5229
Standard Deviation 3.4487
Variance 11.8936
Sum 2.7446e+06
Skewness -0.009294
Kurtosis -1.2081
Coefficient of Variation 0.5287
  • month is not normally distributed (p-value 0.0034789675643796935)

day

numerical

Approximate Distinct Count 31
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 6.4 MB
Mean 15.7296
Minimum 1
Maximum 31
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • day is skewed right (γ1 = 0.0068)

Quantile Statistics

Minimum 1
5-th Percentile 2
Q1 8
Median 16
Q3 23
95-th Percentile 29
Maximum 31
Range 30
IQR 15

Descriptive Statistics

Mean 15.7296
Standard Deviation 8.8001
Variance 77.4418
Sum 6.6185e+06
Skewness 0.00676
Kurtosis -1.194
Coefficient of Variation 0.5595
  • day is not normally distributed (p-value 8.527605992859291e-66)

hour

numerical

Approximate Distinct Count 24
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 6.4 MB
Mean 11.5
Minimum 0
Maximum 23
Zeros 17532
Zeros (%) 4.2%
Negatives 0
Negatives (%) 0.0%

Quantile Statistics

Minimum 0
5-th Percentile 1
Q1 5.75
Median 11.5
Q3 17.25
95-th Percentile 22
Maximum 23
Range 23
IQR 11.5

Descriptive Statistics

Mean 11.5
Standard Deviation 6.9222
Variance 47.9168
Sum 4.8388e+06
Skewness 0
Kurtosis -1.2042
Coefficient of Variation 0.6019
  • hour is not normally distributed (p-value 8.53060929360839e-198)

PM2.5

numerical

Approximate Distinct Count 888
Approximate Unique (%) 0.2%
Missing 8739
Missing (%) 2.1%
Infinite 0
Infinite (%) 0.0%
Memory Size 6.3 MB
Mean 79.7934
Minimum 2
Maximum 999
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • PM2.5 is skewed right (γ1 = 2.014)

Quantile Statistics

Minimum 2
5-th Percentile 7
Q1 21
Median 56
Q3 112.9
95-th Percentile 247
Maximum 999
Range 997
IQR 91.9

Descriptive Statistics

Mean 79.7934
Standard Deviation 80.8224
Variance 6532.2589
Sum 3.2877e+07
Skewness 2.014
Kurtosis 5.9647
Coefficient of Variation 1.0129
  • PM2.5 is not normally distributed (p-value 3.4046266802673274e-14)
  • PM2.5 has 18436 outliers
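
The report does not say how it counts outliers. One common convention is Tukey's rule, which flags values more than 1.5 × IQR beyond the quartiles. The sketch below applies that rule to the reported PM2.5 quartiles; treating it as the report's rule is an assumption, and the helper name `is_outlier` is illustrative.

```python
# Quartiles copied from the PM2.5 Quantile Statistics block.
q1, q3 = 21.0, 112.9

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr


def is_outlier(x: float) -> bool:
    """Return True if x lies outside the Tukey fences (assumed rule)."""
    return x < lower_fence or x > upper_fence


print(f"IQR = {iqr:.1f}, fences = [{lower_fence:.2f}, {upper_fence:.2f}]")
print(is_outlier(247), is_outlier(999))  # 95th percentile and maximum
```

Because PM2.5 has no negative values, only the upper fence matters in practice under this rule.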

PM10

numerical

Approximate Distinct Count 1084
Approximate Unique (%) 0.3%
Missing 6449
Missing (%) 1.5%
Infinite 0
Infinite (%) 0.0%
Memory Size 6.3 MB
Mean 104.6026
Minimum 2
Maximum 999
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • PM10 is skewed right (γ1 = 1.8858)

Quantile Statistics

Minimum 2
5-th Percentile 11
Q1 36
Median 83
Q3 147
95-th Percentile 289
Maximum 999
Range 997
IQR 111

Descriptive Statistics

Mean 104.6026
Standard Deviation 91.7724
Variance 8422.1781
Sum 4.3339e+07
Skewness 1.8858
Kurtosis 6.1855
Coefficient of Variation 0.8773
  • PM10 is not normally distributed (p-value 1.5836592303736184e-07)
  • PM10 has 13829 outliers

SO2

numerical

Approximate Distinct Count 691
Approximate Unique (%) 0.2%
Missing 9021
Missing (%) 2.1%
Infinite 0
Infinite (%) 0.0%
Memory Size 6.3 MB
Mean 15.8308
Minimum 0.2856
Maximum 500
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • SO2 is skewed right (γ1 = 3.0095)

Quantile Statistics

Minimum 0.2856
5-th Percentile 2
Q1 3
Median 7.14
Q3 20
95-th Percentile 62
Maximum 500
Range 499.7144
IQR 17

Descriptive Statistics

Mean 15.8308
Standard Deviation 21.6506
Variance 468.7486
Sum 6.5183e+06
Skewness 3.0095
Kurtosis 14.1877
Coefficient of Variation 1.3676
  • SO2 is not normally distributed (p-value 5.757588592336091e-23)
  • SO2 has 35566 outliers

NO2

numerical

Approximate Distinct Count 1212
Approximate Unique (%) 0.3%
Missing 12116
Missing (%) 2.9%
Infinite 0
Infinite (%) 0.0%
Memory Size 6.2 MB
Mean 50.6386
Minimum 1.0265
Maximum 290
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • NO2 is skewed right (γ1 = 1.0501)

Quantile Statistics

Minimum 1.0265
5-th Percentile 8
Q1 23
Median 44
Q3 72
95-th Percentile 119
Maximum 290
Range 288.9735
IQR 49

Descriptive Statistics

Mean 50.6386
Standard Deviation 35.1279
Variance 1233.9702
Sum 2.0694e+07
Skewness 1.0501
Kurtosis 1.2038
Coefficient of Variation 0.6937
  • NO2 is not normally distributed (p-value 0.008246460055394679)
  • NO2 has 6475 outliers

CO

numerical

Approximate Distinct Count 132
Approximate Unique (%) 0.0%
Missing 20701
Missing (%) 4.9%
Infinite 0
Infinite (%) 0.0%
Memory Size 6.1 MB
Mean 1230.7665
Minimum 100
Maximum 10000
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • CO is skewed right (γ1 = 2.5701)

Quantile Statistics

Minimum 100
5-th Percentile 200
Q1 500
Median 900
Q3 1500
95-th Percentile 3600
Maximum 10000
Range 9900
IQR 1000

Descriptive Statistics

Mean 1230.7665
Standard Deviation 1160.1827
Variance 1.346e+06
Sum 4.9239e+08
Skewness 2.5701
Kurtosis 9.3192
Coefficient of Variation 0.9427
  • CO is not normally distributed (p-value 3.978714308133203e-09)
  • CO has 28054 outliers

O3

numerical

Approximate Distinct Count 1598
Approximate Unique (%) 0.4%
Missing 13277
Missing (%) 3.2%
Infinite 0
Infinite (%) 0.0%
Memory Size 6.2 MB
Mean 57.3723
Minimum 0.2142
Maximum 1071
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • O3 is skewed right (γ1 = 1.6601)

Quantile Statistics

Minimum 0.2142
5-th Percentile 2
Q1 11
Median 45
Q3 83
95-th Percentile 182
Maximum 1071
Range 1070.7858
IQR 72

Descriptive Statistics

Mean 57.3723
Standard Deviation 56.6616
Variance 3210.5378
Sum 2.3379e+07
Skewness 1.6601
Kurtosis 6.2641
Coefficient of Variation 0.9876
  • O3 is not normally distributed (p-value 2.0984867233117432e-16)
  • O3 has 15672 outliers

TEMP

numerical

Approximate Distinct Count 2034
Approximate Unique (%) 0.5%
Missing 398
Missing (%) 0.1%
Infinite 0
Infinite (%) 0.0%
Memory Size 6.4 MB
Mean 13.539
Minimum -19.9
Maximum 41.6
Zeros 2739
Zeros (%) 0.7%
Negatives 64249
Negatives (%) 15.3%
  • TEMP is skewed left (γ1 = -0.1043)

Quantile Statistics

Minimum -19.9
5-th Percentile -4
Q1 3.3
Median 14.7
Q3 23.5
95-th Percentile 30.7
Maximum 41.6
Range 61.5
IQR 20.2

Descriptive Statistics

Mean 13.539
Standard Deviation 11.4361
Variance 130.7853
Sum 5.6914e+06
Skewness -0.1043
Kurtosis -1.1433
Coefficient of Variation 0.8447
  • TEMP is not normally distributed (p-value 1.1271891112294452e-07)

PRES

numerical

Approximate Distinct Count 726
Approximate Unique (%) 0.2%
Missing 393
Missing (%) 0.1%
Infinite 0
Infinite (%) 0.0%
Memory Size 6.4 MB
Mean 1010.747
Minimum 982.4
Maximum 1042.8
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • PRES is skewed right (γ1 = 0.1063)

Quantile Statistics

Minimum 982.4
5-th Percentile 994.972
Q1 1002.4
Median 1010.5
Q3 1019.1
95-th Percentile 1028
Maximum 1042.8
Range 60.4
IQR 16.7

Descriptive Statistics

Mean 1010.747
Standard Deviation 10.4741
Variance 109.7058
Sum 4.2489e+08
Skewness 0.1063
Kurtosis -0.8292
Coefficient of Variation 0.01036
  • PRES is not normally distributed (p-value 9.551963602752667e-30)

DEWP

numerical

Approximate Distinct Count 645
Approximate Unique (%) 0.2%
Missing 403
Missing (%) 0.1%
Infinite 0
Infinite (%) 0.0%
Memory Size 6.4 MB
Mean 2.4908
Minimum -43.4
Maximum 29.1
Zeros 832
Zeros (%) 0.2%
Negatives 185393
Negatives (%) 44.1%
  • DEWP is skewed left (γ1 = -0.1877)

Quantile Statistics

Minimum -43.4
5-th Percentile -19.4
Q1 -8.8
Median 3.3
Q3 15.3
95-th Percentile 22.3
Maximum 29.1
Range 72.5
IQR 24.1

Descriptive Statistics

Mean 2.4908
Standard Deviation 13.7938
Variance 190.2702
Sum 1.0471e+06
Skewness -0.1877
Kurtosis -1.1322
Coefficient of Variation 5.5379
  • DEWP is not normally distributed (p-value 0.00014393967116586067)
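
The coefficient of variation is the standard deviation divided by the mean, so it grows quickly when the mean is near zero. The sketch below recomputes it for TEMP, PRES and DEWP from the reported means and standard deviations. If DEWP is measured on a scale with an arbitrary zero, such as degrees Celsius (an assumption, since the report gives no units), its value of about 5.5 says more about the mean being close to zero than about the spread.

```python
# (mean, standard deviation) copied from the Descriptive Statistics blocks.
stats = {
    "TEMP": (13.539, 11.4361),
    "PRES": (1010.747, 10.4741),
    "DEWP": (2.4908, 13.7938),
}

# Coefficient of variation = standard deviation / mean.
cv = {name: std / mean for name, (mean, std) in stats.items()}

for name, value in cv.items():
    print(f"{name}: CV = {value:.4f}")
```

PRES and DEWP have standard deviations of the same order, yet their coefficients differ by a factor of several hundred because of the difference in means.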

RAIN

numerical

Approximate Distinct Count 253
Approximate Unique (%) 0.1%
Missing 390
Missing (%) 0.1%
Infinite 0
Infinite (%) 0.0%
Memory Size 6.4 MB
Mean 0.06448
Minimum 0
Maximum 72.5
Zeros 403858
Zeros (%) 96.0%
Negatives 0
Negatives (%) 0.0%
  • RAIN is skewed right (γ1 = 30.0435)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 0
Q3 0
95-th Percentile 0
Maximum 72.5
Range 72.5
IQR 0

Descriptive Statistics

Mean 0.06448
Standard Deviation 0.821
Variance 0.674
Sum 27104.2
Skewness 30.0435
Kurtosis 1345.4901
Coefficient of Variation 12.7335
  • RAIN is not normally distributed (p-value 4.232057658358454e-25)
  • RAIN has 16520 outliers
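
The RAIN outlier count equals the number of non-zero, non-missing observations. That is what a 1.5 × IQR rule would produce when both quartiles are 0, although the report does not name its rule, so the rule is an assumption here. A minimal sketch:

```python
# Counts copied from the RAIN block and the Dataset Statistics block.
n_rows = 420_768
rain_missing = 390
rain_zeros = 403_858

# Both quartiles are 0, so the IQR and both fences collapse to 0.
q1 = q3 = 0.0
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# RAIN has no negatives, so every non-zero observation lies above the fence.
nonzero = n_rows - rain_missing - rain_zeros

print(lower_fence, upper_fence, nonzero)
```

For a zero-inflated variable like this one, the outlier count is better read as "hours with any rain" than as a sign of bad data.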

wd

categorical

Approximate Distinct Count 16
Approximate Unique (%) 0.0%
Missing 1822
Missing (%) 0.4%
Memory Size 26.9 MB

Length

Mean 2.2363
Standard Deviation 0.804
Median 2
Minimum 1
Maximum 3

Sample

1st row NNW
2nd row N
3rd row NNW
4th row NW
5th row N

Letter

Count 936895
Lowercase Letter 0
Space Separator 0
Uppercase Letter 936895
Dash Punctuation 0
Decimal Number 0

WSPM

numerical

Approximate Distinct Count 117
Approximate Unique (%) 0.0%
Missing 318
Missing (%) 0.1%
Infinite 0
Infinite (%) 0.0%
Memory Size 6.4 MB
Mean 1.7297
Minimum 0
Maximum 13.2
Zeros 11118
Zeros (%) 2.6%
Negatives 0
Negatives (%) 0.0%
  • WSPM is skewed right (γ1 = 1.6256)

Quantile Statistics

Minimum 0
5-th Percentile 0.4
Q1 0.9
Median 1.4
Q3 2.2
95-th Percentile 4.4
Maximum 13.2
Range 13.2
IQR 1.3

Descriptive Statistics

Mean 1.7297
Standard Deviation 1.2464
Variance 1.5535
Sum 727256.8
Skewness 1.6256
Kurtosis 3.6844
Coefficient of Variation 0.7206
  • WSPM is not normally distributed (p-value 6.0087785715828255e-09)
  • WSPM has 23079 outliers

station

categorical

Approximate Distinct Count 12
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 29.5 MB

Length

Mean 8.4167
Standard Deviation 2.431
Median 7
Minimum 6
Maximum 13

Sample

1st row Aotizhongxin
2nd row Aotizhongxin
3rd row Aotizhongxin
4th row Aotizhongxin
5th row Aotizhongxin

Letter

Count 3541464
Lowercase Letter 3120696
Space Separator 0
Uppercase Letter 420768
Dash Punctuation 0
Decimal Number 0
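
The labels in the Letter block (Lowercase Letter, Uppercase Letter, Decimal Number and so on) correspond to Unicode general categories. The sketch below counts those categories with Python's standard `unicodedata` module on the five sample rows shown above; whether the report uses the same mechanism internally is an assumption, and `char_categories` is an illustrative helper.

```python
import unicodedata
from collections import Counter


def char_categories(values):
    """Count Unicode general categories over all characters in values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample rows listed for the station column.
sample = ["Aotizhongxin"] * 5
counts = char_categories(sample)

# "Lu" = Uppercase Letter, "Ll" = Lowercase Letter, "Nd" = Decimal Number.
print(counts["Lu"], counts["Ll"], counts["Nd"])
```

Over the full column, one uppercase letter per row would give 420768 uppercase letters, which matches the reported count.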

Interactions

Correlations

Missing Values
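
As a cross-check on the missing-value figures, the per-variable counts listed in the Variables section can be summed and compared with the Missing Cells total in the overview. A minimal sketch using only numbers from this report:

```python
# Missing counts copied from each variable's block; variables with
# 0 missing values (No, year, month, day, hour, station) are omitted.
n_rows = 420_768
missing_by_column = {
    "PM2.5": 8739, "PM10": 6449, "SO2": 9021, "NO2": 12116,
    "CO": 20701, "O3": 13277, "TEMP": 398, "PRES": 393,
    "DEWP": 403, "RAIN": 390, "wd": 1822, "WSPM": 318,
}

total_missing = sum(missing_by_column.values())
worst = max(missing_by_column, key=missing_by_column.get)

# Per-column share of missing rows, largest first.
for col, n in sorted(missing_by_column.items(), key=lambda kv: -kv[1]):
    print(f"{col:6s} {n:6d}  {100 * n / n_rows:.2f}%")

print("total:", total_missing, "worst column:", worst)
```

The six pollutant columns account for most of the missing cells; the meteorological columns each miss well under 0.5% of rows.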